Abstract

Quantile  regression  is  an  increasingly  important  tool  that  estimates  the  conditional  quantiles  of 
a  response  Y  given  a  vector  of  regressors  D.  It  usefully  generalizes  Laplace's  median  regression 
and  can  be  used  to  measure  the  effect  of  covariates  not  only  in  the  center  of  a  distribution,  but 
also in the upper and lower tails. For the linear quantile model defined by $Y = D'\gamma(U)$, where
$D'\gamma(U)$ is strictly increasing in $U$ and $U$ is a standard uniform variable independent of $D$, quantile
regression allows estimation of quantile-specific covariate effects $\gamma(\tau)$ for $\tau \in (0,1)$. In this paper,
we propose an instrumental variable quantile regression estimator that appropriately modifies the
conventional quantile regression and recovers quantile-specific covariate effects in an instrumental
variables model defined by $Y = D'\alpha(U)$, where $D'\alpha(U)$ is strictly increasing in $U$ and $U$ is a uniform
variable that may depend on $D$ but is independent of a set of instrumental variables $Z$. The proposed
estimator  and  inferential  procedures  are  computationally  convenient  in  typical  applications  and  can 
be  carried  out  using  software  available  for  conventional  quantile  regression.  In  addition,  the  proposed 
estimation  procedure  gives  rise  to  a  convenient  inferential  procedure  that  is  naturally  robust  to  weak 
identification.  The  use  of  the  proposed  estimator  and  testing  procedure  is  illustrated  through  two 
empirical  examples. 
Keywords:  Quantile  Regression,  Instrumental  Variables,  Weak  Instruments 


1.  Introduction 

Quantile  regression  is  an  important  tool  for  estimating  conditional  quantile  models  that  has  been 
used  in  many  empirical  studies  and  has  been  studied  extensively  in  theoretical  statistics;  see  Koenker 
and  Bassett  (1978),  Koenker  and  Portnoy  (1987),  Portnoy  (1991),  Gutenbrunner  and  Jureckova 
(1992),  Chaudhuri,  Doksum,  and  Samarov  (1997),  Portnoy  and  Koenker  (1997),  Knight  (1998), 
Koenker and Machado (1999), Portnoy (2001), and He and Zhu (2003). One of quantile regression's
most appealing features is its ability to estimate quantile-specific effects that describe the impact of
covariates not only on the center but also on the tails of the outcome distribution. While the central
effects, such as the mean effect obtained through conditional mean regression, provide interesting
summary statistics of the impact of a covariate, they fail to describe the full distributional impact
unless the variable affects both the central and the tail quantiles in the same way. In addition,
in many cases interest focuses on the impact of covariates on points other than the center of the
distribution. For example, in a study of the effectiveness of a job training program, the effect of
training on the low tail of the earnings distribution will likely be of more interest for public policy
than the effect of training on the mean of the distribution.

^1 The Matlab software for this paper is available upon request via e-mail chansenl@gsb.uchicago.edu.
Further updates of this paper can be downloaded at www.mit.edu/~vchern. Address correspondence to C.
Hansen, Asst. Prof. of Econometrics and Statistics, The University of Chicago, Graduate School of Business,
5807 South Woodlawn Avenue, Chicago, IL 60637, USA, chansenl@gsb.uchicago.edu.

^2 Portions of this paper were previously included in MIT Department of Economics Working Paper 02-07,
"An IV Model of Quantile Treatment Effects", 2001.

For  an  outcome  Y  and  set  of  factors  D  affecting  the  outcome,  the  conventional  linear  conditional 
quantile  model  may  be  defined  as 

(1.1)  $Y = D'\gamma(U^*), \qquad U^* \mid D \sim \text{Uniform}(0,1),$

where $\tau \mapsto D'\gamma(\tau)$ is strictly increasing and continuous in $\tau$. Doksum (1974) interprets the
disturbance $U^*$ as individual ability or proneness. By construction, $D'\gamma(\tau)$ is the $\tau$-quantile of $Y$
conditional on $D$. This model generalizes the usual linear regression model

$Y = D'\gamma_0 + \gamma_1(U^*)$

by allowing quantile-specific effects of covariates $D$. For a given quantile indexed by $\tau \in (0,1)$,
the quantile-specific effects $\gamma(\tau)$ can be estimated using standard quantile regression methods (e.g.
Koenker and Bassett (1978)).

In  this  paper,  we  develop  a  new  estimation  method  for  an  endogenous  generalization  of  the 
above  model.  The  developed  approach  is  designed  for  settings  where  the  observed  variables  D  are 
determined  non-experimentally,  making  it  difficult  to  infer  the  true  structural/causal  effects  of  D  on 
the  outcomes.  Specifically,  we  consider  the  model 

(1.2)  $Y = D'\alpha(U), \qquad U \mid Z \sim \text{Uniform}(0,1),$

where $\tau \mapsto D'\alpha(\tau)$ is strictly increasing in $\tau$, $D$ is statistically dependent on $U$, and $Z$ is a set of
instrumental variables that are independent of $U$ but statistically related to $D$. Since $D$ depends on
$U$, the sampled $D$ will depend on $U$, leading to biased sampling or endogeneity. This endogeneity
makes $\gamma(\tau) \neq \alpha(\tau)$, rendering conventional quantile regression inconsistent for estimating (1.2).

For  example,  suppose  that  Y  is  the  hourly  wage  of  a  worker  and  that  D  is  an  individual's 
level of training. The unobserved disturbance $U$ would reflect unobserved personal characteristics,
such  as  ability,  which  influence  the  individual's  wage  via  the  equation  Y  =  D'a(U).  If  high-ability 
individuals  choose  high  levels  of  training,  then  the  level  of  training  is  correlated  with  ability,  which 
causes dependence between $U$ and $D$ and implies that conventional quantile regression will overstate
the true effect of training on earnings, $\gamma(\tau) > \alpha(\tau)$. Instrumental variables $Z$, such as random
assignment  to  training  programs  in  the  training  context,  allow  us  to  overcome  this  problem  by 
providing  a  source  of  variation  in  D  that  is  independent  of  U.  There  are  many  other  interesting 
examples  where  D  is  sampled  depending  on  U,  i.e.  endogenously.  The  empirical  section  presents  a 
supply-demand  example  and  a  training  example. 

Model (1.2) generalizes the conventional instrumental variables model with additive disturbances
$Y = D'\alpha_0 + \alpha_1(U)$, where $U \mid Z \sim U(0,1)$, to cases where the impact of $D$ varies across quantiles of the
outcome distribution. A number of appealing approaches are readily available to estimate $\alpha_0$ in the
conventional  instrumental  variables  model  with  additive  disturbances,  including  the  conventional 
two-stage  least  squares  (2SLS)  estimator  and  its  robust  analogs  by  Amemiya  (1982)  and  Chen  and 
Portnoy  (1996). 

The  purpose  of  this  paper  is  to  provide  practical  estimation  and  inference  methods  for  model 
(1.2).  The  estimator  we  propose  is  an  appealing  modification  of  the  standard  quantile  regression 
that  can  be  constructed  from  a  series  of  conventional  quantile  regressions.  Thus,  the  estimation 
approach  is  computationally  convenient  and  simple  to  implement  in  many  typical  applications.  It 
has  already  been  used  in  empirical  applications,  e.g.  Hausman  and  Sidak  (2004),  Januszewski 
(2004),  and  Chernozhukov  and  Hansen  (2004a),  and  will  be  further  illustrated  with  two  empirical 
applications  in  this  paper.  In  addition,  the  estimation  procedure  leads  naturally  to  an  inference 
procedure  that  will  be  valid  even  when  one  of  the  key  conditions  for  identification  of  the  model,  that 
D  is  statistically  dependent  on  Z,  fails. 

The  remainder  of  this  paper  is  organized  as  follows.  In  Section  2,  we  define  the  model  in  more 
detail and allow for other controls in the equations. Section 3 discusses estimation and testing
procedures based on a set of moment equations introduced in Section 2. Section 4 develops the
asymptotic distribution theory. Section 5 illustrates the use of the derived estimator and testing
procedure through brief empirical examples, and the final section concludes.

2.  The  Instrumental  Quantile  Regression  Method 
2.1.  The  Model 

In  this  section,  we  more  formally  define  the  model  we  will  estimate.  Suppose  we  have  a  structural 
relationship  defined  by 

(2.1)  $Y = D'\alpha(U) + X'\beta(U), \qquad U \mid X, Z \sim \text{Uniform}(0,1),$

(2.2)  D  =  S{X,  Z,  V),  where  V  is  statistically  dependent  on  U, 

(2.3)  $\tau \mapsto D'\alpha(\tau) + X'\beta(\tau)$ strictly increasing in $\tau$.


In  these  equations, 

• $Y$ is the scalar outcome variable of interest,

• $U$ is a scalar random variable that aggregates all of the unobserved factors affecting the
structural outcome equation,

• $D$ is a vector of endogenous variables determined by (2.2), where

• $V$ is a vector of unobserved disturbances determining $D$ and correlated with $U$,

• $Z$ is a vector of instrumental variables (control variables excluded from (2.1) that are
independent of the disturbance $U$ but impact variable $D$ via (2.2)), and

• $X$ is a vector of included control variables.

The observed variables consist of $(Y, D, X, Z)$, and due to the dependence between $V$ and $U$, $D$ is
also sampled depending on $U$.

We  shall  refer  to  the  function 

(2.4)  $S_Y(\tau \mid d, x) = d'\alpha(\tau) + x'\beta(\tau)$

as the Structural Quantile Function (SQF) in order to emphasize that it is in general a different
object than the conditional quantile function $Q_Y(\tau \mid d, x)$. The structural quantile function $S_Y(\tau \mid d, x)$
describes the quantile function of the latent outcome variable $Y_d = d'\alpha(U) + X'\beta(U)$ obtained by
fixing $D = d$ and sampling the disturbance $U \sim U(0,1)$ (all conditional on $X$). This notion of
sampling corresponds to independent sampling of $D$ and $U$, which is generally not feasible outside
experimental settings. Instead the sampled variable $D$ is determined via (2.2). Nevertheless, it is
still possible to estimate or make inference on the structural quantile function $S_Y(\tau \mid d, x)$ through
the  use  of  instrumental  variables  Z  which  induce  variation  in  D  but  are  themselves  independent  of 
U. 

2.2.  The  Principle 

Under the conditions of (2.1) and (2.3), the problem of dependence between $U$ and $D$ is overcome
through the presence of instrumental variables, $Z$, that affect the determination of $D$ but are
independent of $U$. In program evaluation studies with imperfect compliance, a simple example of an
instrument is random assignment to the treatment group, which is done independently of the
potential values of $U$. The presence of the instrumental variable leads to a set of moment equations that
can be used to estimate the parameters of (2.1). From (2.1) and (2.3), the event $\{Y < S_Y(\tau \mid D, X)\}$
is equivalent to the event $\{U < \tau\}$. It then follows from (2.1) that

(2.5)  $P[Y < S_Y(\tau \mid D, X) \mid Z, X] = \tau.$


Equation (2.5) provides a useful statistical restriction that can be used to estimate the structural
parameters $\alpha$ and $\beta$. It is important to notice that the equation $P[Y < S_Y(\tau \mid D, X) \mid Z, X] = \tau$ differs
from the conventional estimating equation

(2.6)  $P[Y < Q_Y(\tau \mid D, X) \mid D, X] = \tau$

used to estimate the conditional quantile function of $Y$ given $D$ and $X$.

Recall from Koenker and Bassett (1978) that the ordinary quantile regression (QR) is formulated
as finding the best predictor of $Y$ given $W$ under the asymmetric least absolute deviation loss
$\rho_\tau(u) := (\tau - 1(u < 0))u$. In other words, assuming integrability, the $\tau$-th conditional quantile of $Y$
given $W$ solves the problem:

(2.7)  $Q_Y(\tau \mid W) = \arg\min_{f \in \mathcal{F}} E[\rho_\tau(Y - f(W))],$

where $\mathcal{F}$ is the class of measurable functions of $W$ (restricted in applications to be a set of flexible
parametric functions). Laplace's median regression function $Q_Y(.5 \mid W)$ is a solution of this problem
with $\tau = 1/2$, so that $\rho_\tau(u) = \frac{1}{2}|u|$. The function $Q_Y(\tau \mid D, X)$ solves the above problem with
$W = (D, X)$ and can be estimated using the finite sample analog of the above equation.
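
As a small illustration of the loss in (2.7), the following Python sketch evaluates $\rho_\tau$ and confirms numerically that a sample quantile minimizes the average loss when $f$ is restricted to a constant. The function names `check_loss` and `sample_quantile_by_loss` are hypothetical and chosen only for this illustration; the search is restricted to observed data points, which is an assumption made for simplicity.

```python
def check_loss(u, tau):
    """Asymmetric absolute loss rho_tau(u) = (tau - 1(u < 0)) * u."""
    return (tau - (1.0 if u < 0 else 0.0)) * u


def sample_quantile_by_loss(y, tau):
    """Return the observed value q that minimizes sum_i rho_tau(y_i - q).

    Restricting the search to observed data points is an assumption of
    this sketch; a minimizer of the check loss always exists among them.
    """
    return min(sorted(y), key=lambda q: sum(check_loss(v - q, tau) for v in y))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 10]
    # With tau = 1/2 the loss is |u| / 2, so the minimizer is the median.
    print(sample_quantile_by_loss(data, 0.5))
```

For $\tau = 1/2$ the loss reduces to $|u|/2$, so the minimizer coincides with the sample median, in line with the remark on Laplace's median regression.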

The moment equation given in (2.5) is equivalent to the statement that 0 is the $\tau$-th quantile
of the random variable $Y - S_Y(\tau \mid D, X)$ conditional on $(Z, X)$:

(2.8)  $0 = Q_{Y - S_Y(\tau \mid D, X)}(\tau \mid Z, X)$ a.s. for each $\tau$.

Thus, we may pose the problem of finding $\alpha(\tau)$ and $\beta(\tau)$ solving equation (2.5) as the instrumental
variable or inverse quantile regression (IVQR). This problem is to find an $S_Y(\tau \mid D, X)$ such
that 0 is a solution to the quantile regression of $Y - S_Y(\tau \mid D, X)$ on $(Z, X)$:

(2.9)  $0 = \arg\min_{f \in \mathcal{F}} E\rho_\tau[(Y - S_Y(\tau \mid D, X) - f(Z, X))],$

where $\mathcal{F}$ is the class of measurable functions of $(X, Z)$ (which will be restricted in applications). The
term 'inverse' emphasizes an evident inverse relation of this problem to the conventional quantile
regression, (2.7).


3.  The Instrumental Quantile Regression Estimator and Derived Dual Inference

3.1.  Basic  Description  and  Properties 

Next we consider a finite-sample analog of the above procedure. Define the (weighted) conventional
quantile regression objective function as

$Q_n(\tau, \alpha, \beta, \gamma) := \frac{1}{n} \sum_{i=1}^{n} \rho_\tau(Y_i - D_i'\alpha - X_i'\beta - \hat{Z}_i'\gamma)\hat{V}_i,$

where $D$ is a $\dim(\alpha)$-vector of endogenous variables, $X$ is a $\dim(\beta)$-vector of exogenous explanatory
variables, $\hat{Z}_i := f(X_i, Z_i)$ is a $\dim(\gamma)$-vector of instrumental variables such that $\dim(\gamma) \ge \dim(\alpha)$,
and $\hat{V}_i := V(X_i, Z_i) > 0$ is a scalar weight. In practice, a simple procedure is to set $\hat{V}_i = 1$ and let
$\hat{Z}_i$ be either $Z_i$ or the predicted value from a least squares projection of $D_i$ on $Z_i$ and $X_i$.

The  instrumental  variable  or  inverse  quantile  regression  estimator  (IVQR)  is  defined  as 
follows. For a given value of the structural parameter, say $\alpha$, let us run the ordinary QR to obtain

(3.1)  $(\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau)) := \arg\min_{\beta, \gamma} Q_n(\tau, \alpha, \beta, \gamma).$

To find an estimate for $\alpha(\tau)$, we will look for a value $\alpha$ that makes the coefficient on the instrumental
variable $\hat\gamma(\alpha, \tau)$ as close to 0 as possible. Formally, let

(3.2)  $\hat\alpha(\tau) = \arg\inf_{\alpha \in \mathcal{A}} [W_n(\alpha)], \qquad W_n(\alpha) := n[\hat\gamma(\alpha, \tau)']\hat{A}(\alpha)[\hat\gamma(\alpha, \tau)],$

where $\hat{A}(\alpha) = A(\alpha) + o_p(1)$ and $A(\alpha)$ is positive definite, uniformly in $\alpha \in \mathcal{A}$. It is convenient to
set $A(\alpha)$ equal to the inverse of the asymptotic covariance matrix of $\sqrt{n}(\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau))$, in which
case $W_n(\alpha)$ is the Wald statistic for testing $\gamma(\alpha, \tau) = 0$, a fact that we will use below for inference
about $\alpha(\tau)$ itself. The parameter estimates are then given by

(3.3)  $\hat\theta(\tau) := (\hat\alpha(\tau), \hat\beta(\tau)) := (\hat\alpha(\tau), \hat\beta(\hat\alpha(\tau), \tau)).$

The estimator (3.3) is a finite-sample instrumental variable quantile regression. Analogous to the
population problem (2.8), it finds parameter values for $\alpha$ and $\beta$ through the inverse step (3.2) such
that the value of the coefficient $\hat\gamma(\alpha, \tau)$ on $Z$ in the ordinary quantile regression step (3.1) is driven as
close to zero as possible. This estimator is consistent and asymptotically normal under appropriate
regularity and identification conditions:

(3.4)  $\sqrt{n}(\hat\theta(\tau) - \theta(\tau)) \to_d N(0, \Omega_\theta),$

for $\Omega_\theta$ specified below. This asymptotic distribution can be used for conducting direct inference on
the parameter of interest using standard Wald procedures.


In addition, we can base inference on the "dual" Wald statistic $W_n(\alpha)$ for testing whether the
coefficients on the instruments are zero (i.e. whether $\gamma(\alpha, \tau) = 0$). When $\alpha = \alpha(\tau)$, $W_n(\alpha)$ is
asymptotically chi-squared with $\dim(\gamma)$ degrees of freedom:

(3.5)  $W_n(\alpha(\tau)) \to_d \chi^2(\dim(\gamma)).$

Thus, a valid confidence region for $\alpha(\tau)$ can also be based on the inversion of this dual Wald statistic:

(3.6)  $CR_p[\alpha(\tau)] := \{\alpha : W_n(\alpha) < c_p\}$ contains $\alpha(\tau)$ with probability approaching $p$,

where $c_p$ is the $p$-percentile of a $\chi^2(\dim(\gamma))$ distribution. This dual approach is valid under weaker
assumptions than the direct approach; in particular, it is robust to weak or partial identification of
$\alpha(\tau)$. Section 3.5 discusses the properties of the dual procedure in more detail.
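
The inversion in (3.6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values of $W_n(\alpha_j)$ are taken as already computed on a grid, the function name `dual_confidence_region` is hypothetical, and the hard-coded table holds the standard 95th percentiles of the chi-squared distribution for one to three degrees of freedom.

```python
# 95th percentiles of the chi-squared distribution, keyed by degrees of freedom.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}


def dual_confidence_region(alphas, wald_stats, dim_gamma=1):
    """Invert the dual Wald statistic on a grid, as in (3.6).

    alphas     : grid of candidate values alpha_j (scalar alpha assumed)
    wald_stats : W_n(alpha_j) for each grid point, computed elsewhere
    Returns the retained grid points and their (min, max) end-points,
    or ([], None) if no grid point is retained.
    """
    c_p = CHI2_95[dim_gamma]
    kept = [a for a, w in zip(alphas, wald_stats) if w < c_p]
    if not kept:
        return [], None
    return kept, (min(kept), max(kept))


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 1.5, 2.0]
    w = [10.0, 3.0, 0.2, 2.5, 9.0]   # made-up statistics for illustration
    print(dual_confidence_region(grid, w))
```

The retained set need not be an interval. Under weak identification $W_n(\alpha)$ can be flat in $\alpha$, in which case the region may cover most of the grid, which is how the dual procedure signals a lack of information about $\alpha(\tau)$.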

For  a  given  probability  index  r  of  interest,  the  estimator  may  be  computed  in  practice  as  follows: 

1. Define a suitable set of values $\{\alpha_j, j = 1, \ldots, J\}$, and run the ordinary $\tau$-quantile regression
of $Y_i - D_i'\alpha_j$ on $X_i$ and $\hat{Z}_i$ to obtain coefficients $\hat\beta(\alpha_j, \tau)$ and $\hat\gamma(\alpha_j, \tau)$.

2. Save the inverse of the variance-covariance matrix of $\hat\gamma(\alpha_j, \tau)$, which is readily available in
any common implementation of the ordinary QR, to use as $\hat{A}(\alpha_j)$ in $W_n(\alpha_j)$. Then $W_n(\alpha_j)$ becomes
a Wald or F-statistic for testing $\gamma(\alpha_j, \tau) = 0$, depending on the naming convention.

3. Choose $\hat\alpha(\tau)$ as a value among $\{\alpha_j, j = 1, \ldots, J\}$ that minimizes $W_n(\alpha)$. The estimate of $\beta(\tau)$
is then given by $\hat\beta(\hat\alpha(\tau), \tau)$.

4. Direct inference on $\alpha(\tau)$ may be conducted using the variance formula for $\Omega_\theta$ provided below.
Dual confidence regions for $\alpha(\tau)$, $CR_p$, may be computed as $CR_p[\alpha(\tau)] = \{\alpha_j : W_n(\alpha_j) < c_p\}$, and
its upper and lower bounds may be used as end-points of a confidence interval for $\alpha(\tau)$.
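
Steps 1 and 3 can be sketched in Python for a deliberately simplified special case. The sketch assumes a scalar $D$, a single binary instrument, no controls other than an intercept, unit weights, and $A(\alpha) = 1$, so that the criterion is $|\hat\gamma(\alpha_j, \tau)|$ rather than the Wald statistic of step 2. In this saturated case the quantile regression on an intercept and a binary regressor separates by group, so the coefficient on the instrument is a difference of group quantiles and no QR solver is needed. The function names are hypothetical.

```python
import math


def quantile(values, tau):
    """Sample tau-quantile: the order statistic with rank ceil(tau * n)."""
    s = sorted(values)
    return s[max(0, math.ceil(tau * len(s)) - 1)]


def gamma_hat(y, d, z, alpha, tau):
    """Coefficient on a binary instrument z in the tau-quantile regression
    of y - d * alpha on (1, z). With a saturated design this equals the
    difference between the two group quantiles of the residual."""
    r0 = [yi - di * alpha for yi, di, zi in zip(y, d, z) if zi == 0]
    r1 = [yi - di * alpha for yi, di, zi in zip(y, d, z) if zi == 1]
    return quantile(r1, tau) - quantile(r0, tau)


def ivqr_grid(y, d, z, tau, grid):
    """Pick the grid value alpha_j that drives |gamma_hat| closest to zero
    (ties are broken in favour of the smaller alpha_j)."""
    return min((abs(gamma_hat(y, d, z, a, tau)), a) for a in grid)[1]


if __name__ == "__main__":
    # Toy data: the z = 1 group is the z = 0 group shifted by 2 with d = 1.
    y = [1, 2, 3, 4, 5, 3, 4, 5, 6, 7]
    d = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    z = d[:]
    grid = [0.5 * j for j in range(9)]   # 0.0, 0.5, ..., 4.0
    print(ivqr_grid(y, d, z, tau=0.5, grid=grid))
```

With controls $X$ or non-binary instruments, each grid point would instead require an ordinary quantile regression from standard software, together with its estimated covariance matrix to form $W_n(\alpha_j)$.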

3.2.  Computational  Complexity  and  Implementation 

One of the most appealing features of the IVQR and the associated dual inference confidence region is
that both may be computed using the output from the conventional QR in any modern software.
Portnoy and Koenker (1997) show that ordinary QR can be computed in polynomial stochastic time
$O_p(\dim(\beta, \gamma)^3 \times n^{1+\delta})$ using interior point algorithms with preprocessing,^3 so the above IVQR
procedure has computational complexity of $O_p((1/\epsilon)^{\dim(\alpha)} \times \dim(\beta, \gamma)^3 \times n^{1+\delta})$ for a desired level
of accuracy $\epsilon$ and some $\delta > 0$. Since we need $\epsilon \propto 1/n^{a}$, $a > 1/2$, and it suffices to have $a = 1/2 + \delta'$
for some small $\delta' > 0$, the proposed algorithm has computational complexity

$O_p\left(n^{(1/2+\delta')\dim(\alpha)} \times \dim(\beta, \gamma)^3 \times n^{1+\delta}\right)$

that is polynomial in the sample size $n$ and in the dimension of $(\beta, \gamma)$, but is not polynomial in the
dimension of $\alpha$. Thus, the procedure will be computationally attractive and work well when the
number of exogenous variables, $\dim(\beta)$, is possibly large, but the number of endogenous variables,
$\dim(\alpha)$, is small. This is the most common case in econometric specifications,
where typically $\dim(\alpha) = 1$ or 2 and $\dim(\beta)$ varies from 1 to 50 or more. In fact, due to
its practical properties, the estimator has already been applied in empirical analysis by Hausman
and Sidak (2004), Januszewski (2004), and Chernozhukov and Hansen (2004a), and this paper also
presents two additional empirical applications.

^3 In contrast, simplex procedures will have a slower running time.
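
To make the role of $\dim(\alpha)$ concrete, the following Python sketch counts the number of ordinary quantile regressions implied by a grid search with spacing proportional to $n^{-a}$. It assumes, purely for illustration, that each coordinate of $\alpha$ ranges over an interval of unit length and that the proportionality constant is one; the function name `ivqr_grid_size` and the default $a = 0.55$ are hypothetical choices.

```python
import math


def ivqr_grid_size(n, dim_alpha, a=0.55):
    """Number of ordinary QR fits for a grid with spacing n**(-a) per
    coordinate of alpha, over a unit-length range in each coordinate
    (both the range and the constant are assumptions of this sketch)."""
    points_per_dim = math.ceil(n ** a)
    return points_per_dim ** dim_alpha


if __name__ == "__main__":
    for dim_alpha in (1, 2, 3):
        print(dim_alpha, ivqr_grid_size(1000, dim_alpha))
```

The count grows geometrically in $\dim(\alpha)$ while each individual fit remains polynomial in $n$ and $\dim(\beta, \gamma)$, which is why the procedure is attractive mainly when there are one or two endogenous variables.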

There are other approaches that one could adopt for estimation of the model defined in Section
2.1. For example, an immediate approach is the method of moments approach (MM) that attempts
to minimize $\left\| \frac{1}{n} \sum_{i=1}^{n} (1(Y_i < D_i'\alpha + X_i'\beta) - \tau)(X_i', \hat{Z}_i')'\hat{V}_i \right\|$ over $\alpha$ and $\beta$. Another example is
the estimator of Sakata (2001), which is an elegant maximum likelihood type estimator based on
the absolute deviation.^4 In contrast to the IVQR approach, these alternative approaches involve
highly non-convex, multi-modal, and non-smooth objective functions over many parameters, which
poses a serious computational challenge. Implementation of extremum estimators with non-smooth
and, more importantly, non-convex objective functions generally requires non-convex searches over
parameter sets of dimension $K = \dim(\beta) + \dim(\alpha)$, which will be quite large in many cases due to the
high dimension of $\beta$. Thus, the IVQR approach will have an advantage when $\dim(\alpha)$ is small, as is
the case in many applications. When $\dim(\alpha)$ is high, both the IVQR approach and the MM approach
become difficult to implement. In such settings one could use the quasi-Bayesian methods for MM
developed in Chernozhukov and Hong (2003). This approach computes estimates and confidence
intervals using a quasi-posterior defined as the exponent of the MM function specified above.
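
The non-smoothness of the MM criterion is easy to see in code. The Python sketch below evaluates the norm of the sample moments for scalar $D$, scalar $X$ (for example, an intercept), a scalar instrument, and unit weights. These dimensional restrictions, the Euclidean norm, and the function name `mm_objective` are assumptions made for illustration.

```python
import math


def mm_objective(y, d, x, z, alpha, beta, tau):
    """Euclidean norm of (1/n) * sum_i (1(y_i < d_i*alpha + x_i*beta) - tau) * (x_i, z_i)
    with unit weights. Scalar d, x and z are assumed in this sketch."""
    n = len(y)
    m_x = 0.0
    m_z = 0.0
    for yi, di, xi, zi in zip(y, d, x, z):
        e = (1.0 if yi < di * alpha + xi * beta else 0.0) - tau
        m_x += e * xi
        m_z += e * zi
    return math.hypot(m_x / n, m_z / n)


if __name__ == "__main__":
    y = [0, 1, 2, 3]
    d = [0, 0, 1, 1]
    x = [1, 1, 1, 1]   # intercept
    z = [0, 0, 1, 1]
    print(mm_objective(y, d, x, z, alpha=2.0, beta=0.5, tau=0.5))
    print(mm_objective(y, d, x, z, alpha=0.0, beta=0.5, tau=0.5))
```

Because the indicator changes only when a fitted value crosses an observation, the criterion is piecewise constant in $(\alpha, \beta)$, so gradient-based optimizers receive no local information and a global search over all $\dim(\beta) + \dim(\alpha)$ parameters is needed.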

4.  Asymptotic  Distribution  Theory 
4.1.  Assumptions 

To  state  the  assumptions,  define  the  population  objective  function  as 

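(4.1)  $Q(\tau, \alpha, \beta, \gamma) := E[\rho_\tau(Y_i - D_i'\alpha - X_i'\beta - Z_i'\gamma)V_i],$

and let

(4.2)  $(\beta(\alpha, \tau), \gamma(\alpha, \tau)) := \arg\min_{\beta, \gamma} Q(\tau, \alpha, \beta, \gamma).$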


^4 In our notation, the estimator solves the following program:

$\max_{a, b} \min_{\gamma, \delta} \; \sum_{i} |Y_i - D_i'a - X_i'b - Z_i'\gamma - X_i'\delta| \Big/ \sum_{i} |Y_i - D_i'a - X_i'b|.$

Define the parameter space $\Theta = \mathcal{A} \times \mathcal{B}$ as a compact convex set such that $\mathcal{B}$ contains the population
value $\beta(\alpha, \tau)$ for each $\alpha \in \mathcal{A}$ in its interior, so that the parameter-on-the-boundary problem does
not arise. We assume the data were generated by the model defined in Section 2 and impose the
following additional assumptions:

R1  $(Y_i, D_i, X_i, Z_i)$ are iid defined on the probability space $(\Omega, F, P)$ and have compact support.

R2  For the given $\tau$, $(\alpha(\tau), \beta(\tau))$ is in the interior of the specified set $\Theta$.

R3  Density $f_Y(Y \mid X, D, Z)$ is bounded by a constant $\bar{f}$ a.s.

R4  $\partial E[1(Y < D'\alpha + X'\beta + Z'\gamma)\Psi]/\partial(\beta', \gamma')$ has full rank at each $\alpha$ in $\mathcal{A}$, for $\Psi = V \cdot (Z', X')'$.

The compactness conditions in R1 and R2 simplify the analysis. The bounded density in R3 and
compactness condition in R1 are sufficient for the Jacobian matrix in R4 to be well-defined. The full
rank condition in R4 and iid sampling suffice for the estimates $(\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau))$ to be asymptotically
normal and are sufficient for implementing the dual inference. It is important to note that R1-R4
do not impose any conditions on the relation between $D$ and $Z$; that is, unlike the direct inference
procedure, the dual procedure will be valid when identification is weak or fails partially or completely.

Stronger  additional  conditions  are  imposed  for  implementing  the  direct  inference. 

R5  $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has full rank at $(\alpha(\tau)', \beta(\tau)')'$.

R6  The function $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is one-to-one over $\Theta$.

The imposition of R1-R6 is sufficient for identification and asymptotic normality of the IVQR
estimator, both of which are necessary for the validity of the direct inference approach. These
assumptions considerably strengthen the conditions R1-R4 by imposing restrictions on the relationship
between $D$ and $Z$. The dual approach does not require these assumptions for its validity. Hence,
the dual approach is robust to the violation of either R5 or R6.

To further comment on the nature of correlation between $Z$ and $D$ required by R5, note that
by R1 and R3 we have that

$\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta') = E[f_\epsilon(0 \mid X, D, Z)V_i(Z_i', X_i')'(D_i', X_i')].$

Hence, if we set $V_i = 1$ for simplicity, we see that the Jacobian in R5 takes the form of a density-weighted
covariation matrix between $D$ and $Z$, and R5 requires that this matrix has full rank. R6 imposes
that global identifiability must hold; hence, the impact of $Z$ should be rich enough to guarantee
that the moment equations are solved uniquely. These assumptions are required to carry out direct
inference but are not required in the dual approach. Thus, discrepancies between the dual approach
and the direct approach should be indicative of situations where R5 and R6 do not hold.

Before proceeding to the asymptotic results, we provide some sufficient and more primitive
conditions for the global identifiability condition R6. A set of conditions that suffices for both R5
and R6 is as follows:

R5'  $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has full rank at each $(\alpha, \beta)$ in $\Theta$.

R6'  The image of $\Theta$ under $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is simply-connected.

R5' ensures that the mapping $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is locally one-to-one
everywhere. The simple-connectivity condition R6' curbs somewhat the non-linearity of the mapping
and implies a global one-to-one relationship by a Plastock-Hadamard type result, cf. Ambrosetti
and Prodi (1995). This fact and equations (2.4) and (2.5) imply that the solution of the equation

$E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi] = 0$

is unique and is given by $(\alpha(\tau), \beta(\tau))$.

Other  sufficient  and  more  primitive  conditions  for  R5  and  R6  also  result  through  an  application 
of Theorem 2 in Mas-Colell (1979). Let $\Theta'$ be a convex compact set that contains $\Theta$ and that has a
smooth boundary $\partial\Theta'$. Then the following conditions imply R5' and R6' and hence R5 and R6.

R5*  $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has a positive determinant at each $(\alpha, \beta)$ in $\Theta'$.

R6*  $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ is positive quasi-definite along the boundary $\partial\Theta'$ in the
sense defined by Mas-Colell (1979).

Note  that  in  the  exogenous  model,  we  can  set  Zi  =  Di,  and  these  conditions  will  be  trivially  satisfied. 

4.2.  Asymptotic  Properties  of  the  Dual  Inference 

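We first state the formal results for dual inference, because they are the simplest to explain. Under
the conditions R1-R4, as $n \to \infty$, uniformly in $\alpha \in \mathcal{A}$, for $\vartheta(\alpha, \tau) := (\beta(\alpha, \tau)', \gamma(\alpha, \tau)')'$,

(4.3)  $\sqrt{n}(\hat\vartheta(\alpha, \tau) - \vartheta(\alpha, \tau)) = -J_\vartheta^{-1}(\alpha) \cdot n^{-1/2} \sum_{i=1}^{n} s_i(\alpha) + o_p(1),$

(4.4)  $s_i(\alpha) = [\tau - 1(\epsilon_i(\alpha) < 0)]\Psi_i, \qquad \Psi_i = V_i(X_i', Z_i')',$

(4.5)  $\epsilon_i(\alpha) = Y_i - D_i'\alpha - X_i'\beta(\alpha, \tau) - Z_i'\gamma(\alpha, \tau),$

(4.6)  $J_\vartheta(\alpha) = E[f_{\epsilon(\alpha)}(0 \mid Z, X)\Psi\Psi'/V].$

(4.3)-(4.6) follow by adopting standard arguments for the quantile regression process. The difference
here is that we have a process in $\alpha$, whereas we usually have a process over $\tau$. Hence for each $\alpha$

(4.7)  $\sqrt{n}(\hat\vartheta(\alpha, \tau) - \vartheta(\alpha, \tau)) \to_d N(0, \Omega_\vartheta[\alpha]),$

(4.8)  $\Omega_\vartheta[\alpha] = J_\vartheta^{-1}[\alpha]S[\alpha]J_\vartheta^{-1}[\alpha], \qquad S[\alpha] = E[s_i(\alpha)s_i(\alpha)'].$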

The statistic for testing $\gamma(\alpha, \tau) = 0$ is given by the Wald statistic $W_n(\alpha) = n[\hat\gamma(\alpha, \tau)']([\hat\Omega_\vartheta[\alpha]]_{22})^{-1}[\hat\gamma(\alpha, \tau)]$,
where $[\cdot]_{22}$ denotes the block corresponding to $\gamma$ and $\hat\Omega_\vartheta[\alpha] = \Omega_\vartheta[\alpha] + o_p(1)$ is any standard consistent
estimate of the asymptotic variance (4.8) of the ordinary QR. Therefore, when $\alpha = \alpha(\tau)$

(4.9)  $W_n[\alpha(\tau)] \to_d \chi^2(\dim(\gamma)),$

and for the confidence region $CR_p[\alpha(\tau)] := \{\alpha \in \mathcal{A} : W_n(\alpha) < c_p\}$, where $P\{\chi^2(\dim(\gamma)) < c_p\} = p$,

(4.10)  $P\{\alpha(\tau) \in CR_p[\alpha(\tau)]\} = P\{W_n[\alpha(\tau)] < c_p\} \to p.$

Proposition 1.  Under conditions R1-R4, the results (4.3)-(4.10) are true.

Comment 1. Unlike direct inference, the dual inference requires only assumptions R1-R4 to hold.
The results for dual inference are also straightforward to extend in various directions. In particular,
one can note that the preliminary estimation of weights $V_i$ and instruments $Z_i$ will not affect (4.9)
or even (4.7) as long as $\alpha$ is in a root-$n$ neighborhood of $\alpha(\tau)$. Additional regularity conditions on
the estimates of weights and instruments that must be imposed can be found in Andrews (1994).

4.3.  Asymptotic  Properties  of  the  Direct  Inference 

The  following  proposition  presents  the  asymptotic  properties  of  the  direct  approach. 

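Proposition 2. In the specified model under conditions R1-R4 and conditions R5-R6, sufficient
conditions for which are either R5'-R6' or R5*-R6*,

(4.11)  $\sqrt{n}(\hat\theta(\tau) - \theta(\tau)) \to_d N(0, \Omega_\theta), \qquad \Omega_\theta = (K', L')'S(K', L'),$

where, for $\Psi = V \cdot [X', Z']'$ and $\epsilon = Y - D'\alpha(\tau) - X'\beta(\tau)$, $S = \tau(1-\tau)E[\Psi\Psi']$, $K = (J_\alpha'HJ_\alpha)^{-1}J_\alpha'H$,
$H = \bar{J}_\gamma'A[\alpha(\tau)]\bar{J}_\gamma$, $L = \bar{J}_\beta M$, $M = I_{k+r} - J_\alpha K$ with $k + r = \dim(\beta, \gamma)$, $J_\alpha = E[f_\epsilon(0 \mid X, Z, D)\Psi D']$, and $[\bar{J}_\beta', \bar{J}_\gamma']'$ is a
partition of $J_\vartheta^{-1} := (E[f_\epsilon(0 \mid X, Z)\Psi\Psi'/V])^{-1}$ such that $\bar{J}_\beta$ is a $\dim(\beta) \times \dim(\beta, \gamma)$ matrix and $\bar{J}_\gamma$ is
a $\dim(\gamma) \times \dim(\beta, \gamma)$ matrix.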

Corollary 1. When $\dim(\gamma) = \dim(\alpha)$, the choice of $A(\alpha)$ does not affect the asymptotic variance, and
the joint asymptotic variance of $\hat\alpha(\tau)$ and $\hat\beta(\tau)$ will generally have the simple form

$\Omega_\theta = J_\theta^{-1}S(J_\theta')^{-1}$

for $S$ defined above and $J_\theta = E[f_\epsilon(0 \mid X, Z, D)\Psi[D', X']]$.

Corollary 2. When $\dim(\gamma) > \dim(\alpha)$, the choice of the weighting matrix $A(\alpha)$ in the objective
function $W_n(\alpha)$ generally matters. A natural choice for $A(\alpha)$ is given by $A(\alpha) = ([\Omega_\vartheta[\alpha]]_{22})^{-1}$, which
corresponds to the inverse of the covariance matrix of $\sqrt{n}(\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau))$. Noting that $A(\alpha)$ is
equal to $(\bar{J}_\gamma S\bar{J}_\gamma')^{-1}$ at $\alpha(\tau)$, it follows that the asymptotic variance of $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau))$ equals

Corollary 3. The efficient score for $(\alpha(\tau), \beta(\tau))$ is given by $\frac{1}{\tau(1-\tau)}[\tau - 1(\epsilon < 0)]\Psi^*$, where $\Psi^* =
V^* \cdot [X', Z^{*\prime}]'$, $Z^* := E[D \cdot v^* \mid Z, X]/V^*$, $v^* := f_\epsilon(0 \mid D, Z, X)$, and $V^* = f_\epsilon(0 \mid Z, X)$. Thus, if $Z = Z^*$
and $V = V^*$, the asymptotic variance of $(\hat\alpha(\tau), \hat\beta(\tau))$ attains the efficiency bound $\tau(1-\tau)E[\Psi^*\Psi^{*\prime}]^{-1}$.

Comment  2.  Corollary  1  is  especially  convenient  since  the  variance  formula  becomes  simple  once 
the  instrument  Z  is  collapsed  to  the  same  dimension  as  D.  Corollary  3  shows  how  to  construct 
the  instrument  Z  and  weight  V  such  that  the  IVQR  estimator  achieves  the  efficiency  bound  in  the 
sense  defined  by  Amemiya  (1977)  as  well  as  the  semi-parametric  efficiency  bound  in  the  sense  of 
Bickel, Klaassen, Ritov, and Wellner (1993). Efficient estimation can be implemented in two steps.
In the first step, IVQR is used to obtain residuals $\hat\epsilon_i$. In the second step, the required weights
$V^*$ and instruments $Z^*$ are estimated using nonparametric or parametric methods and are used
in IVQR again. It can be shown that estimation of weights and instruments has no effect on the
limit  distribution  of  the  estimators,  provided  additional  regularity  conditions,  found  e.g.  in  Andrews 
(1994),  on  the  estimates  of  weights  and  instruments  hold. 

4.4.  Estimating  Variance  and  Jacobian  Matrices 

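The components of the variance matrices that we need to estimate include $J_\vartheta$, $J_\alpha$, and $S$ for direct
inference and $J_\vartheta[\alpha]$ and $S[\alpha]$ for dual inference. Following Koenker's (1994) analysis for ordinary
QR, the first set of components can be estimated as follows:

$\hat S = \frac{1}{n}\sum_{i=1}^{n} \hat s_i \hat s_i'$,  $\hat J_\alpha = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i/h)/h]\,\hat\Psi_i D_i'$,  $\hat J_\vartheta = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i/h)/h]\,\hat\Psi_i \hat\Psi_i'/\hat V_i$,

where $\hat\epsilon_i := Y_i - D_i'\hat\alpha(\tau) - X_i'\hat\beta(\tau)$, $\hat s_i = [\tau - 1(\hat\epsilon_i < 0)]\hat\Psi_i$, $\hat\Psi_i = \hat V_i[Z_i', X_i']'$, $h$ is a bandwidth chosen
such that $h \to 0$ and $nh^2 \to \infty$, and $K(\cdot)$ is a kernel function. Specific choices of $h$ are discussed in
Koenker (1994). Similarly, the second set of estimates is given by

$\hat S[\alpha] = \frac{1}{n}\sum_{i=1}^{n} \hat s_i[\alpha] \hat s_i[\alpha]'$,  $\hat J_\vartheta[\alpha] = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i[\alpha]/h)/h]\,\hat\Psi_i \hat\Psi_i'/\hat V_i$,

where $\hat\epsilon_i[\alpha] := Y_i - D_i'\alpha - X_i'\hat\beta(\alpha,\tau) - Z_i'\hat\gamma(\alpha,\tau)$ and $\hat s_i[\alpha] = [\tau - 1(\hat\epsilon_i[\alpha] < 0)]\hat\Psi_i$, $\hat\Psi_i = \hat V_i[Z_i', X_i']'$, $h$
is a bandwidth chosen such that $h \to 0$ and $nh^2 \to \infty$, and $K(\cdot)$ is a kernel function. The consistency
properties of these estimators are standard and will not be discussed here.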
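
As a rough illustration of the first set of estimators, the sketch below computes plug-in versions of $S$, $J_\alpha$, and $J_\vartheta$ from fitted residuals. It assumes the reading in which $\hat S$ averages $\hat s_i \hat s_i'$ and the two Jacobians are kernel-weighted averages of $\hat\Psi_i D_i'$ and $\hat\Psi_i\hat\Psi_i'/\hat V_i$. The function names, the Gaussian kernel, the bandwidth $h = n^{-1/5}$, and the simulated inputs are all assumptions made for this example, not choices taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K (an assumed choice)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def variance_components(resid, Psi, D, V, tau, h, kernel=gaussian_kernel):
    """Plug-in estimates of S, J_alpha and J_vartheta.

    resid : (n,)   fitted residuals
    Psi   : (n, k) rows Psi_i = V_i * [Z_i', X_i']'
    D     : (n, d) endogenous regressors
    V     : (n,)   positive weights
    """
    n = resid.shape[0]
    # s_i = [tau - 1(resid_i < 0)] * Psi_i
    s = (tau - (resid < 0))[:, None] * Psi
    S_hat = s.T @ s / n
    # K(resid_i / h) / h acts as a kernel estimate of the residual density at zero
    w = kernel(resid / h) / h
    J_alpha = (w[:, None] * Psi).T @ D / n
    J_vartheta = (w[:, None] * Psi / V[:, None]).T @ Psi / n
    return S_hat, J_alpha, J_vartheta

# Hypothetical inputs, only to show the shapes involved.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 1))          # one instrument
X = np.ones((n, 1))                  # intercept only
D = Z + rng.normal(size=(n, 1))      # one endogenous regressor
V = np.ones(n)                       # unit weights
Psi = V[:, None] * np.hstack([Z, X])
resid = rng.normal(size=n)           # stand-in for fitted IVQR residuals
h = n ** (-1 / 5)                    # h -> 0 and n * h**2 -> infinity
S_hat, J_alpha, J_vartheta = variance_components(resid, Psi, D, V, tau=0.5, h=h)
```

In practice the residuals would come from the fitted IVQR model rather than from a random draw, and the bandwidth would follow one of the rules discussed in Koenker (1994).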


5.  Empirical  Examples 

In  this  section,  we  present  two  applications  of  the  estimation  and  inference  results  derived  in  Section 
3.  The  first  application  reports  the  results  of  a  simple  analysis  of  the  demand  for  fish.  This 
application  makes  use  of  a  small  sample  and  illustrates  the  potential  differences  between  the  direct 
and dual inference procedures. In the second example, we consider the effects of a job training
program  on  earnings.  In  this  case,  the  identification  is  quite  strong,  and  we  see  small  differences 
between  the  direct  and  dual  inference  procedures.  The  results  here  also  demonstrate  the  bias  in 
the  conventional  quantile  regression  under  endogeneity.  In  particular,  the  conventional  quantile 
regression  estimates  indicate  that  the  effect  of  training  is  positive  and  significant  across  the  entire 
outcome  distribution,  while  the  IVQR  estimates  indicate  that  the  training  impact  is  close  to  zero  in 
the  lower  tail  of  the  outcome  distribution. 

5.1.  Demand for Fish

In  this  section,  we  present  estimates  of  demand  elasticities  which  may  potentially  vary  with  the 
level  of  demand.  The  data  contain  observations  on  price  and  quantity  of  fresh  whiting  sold  in  the 
Fulton  fish  market  in  New  York  over  the  five  month  period  from  December  2,  1991  to  May  8,  1992. 
These  data  were  used  previously  in  Graddy  (1995)  to  test  for  imperfect  competition  in  the  market. 
The  price  and  quantity  data  are  aggregated  by  day,  with  the  price  measured  as  the  average  daily 
price  and  the  quantity  as  the  total  amount  of  fish  sold  that  day.  The  total  sample  consists  of  111 
observations  for  the  days  in  which  the  market  was  open  over  the  sample  period. 

For  the  purposes  of  this  illustration,  we  focus  on  a  simple  Cobb-Douglas  random  demand  model 
with  non-additive  disturbance: 

$\ln(Q_p) = \alpha_0(U) + \alpha_1(U)\ln(p)$,

where $Q_p$ is the quantity that would be demanded if the price were $p$, $U$ is an unobservable affecting
the level of demand normalized to follow $U(0,1)$, and $\alpha_1(U)$ is the random demand elasticity when
the level of demand is $U$. A supply function $S_p = f(p, Z, \mathcal{U})$ describes how much producers would
supply if the price were $p$, subject to other factors $Z$ and unobserved disturbance $\mathcal{U}$. The factors $Z$
affecting supply are assumed to be independent of the demand disturbance $U$.

The observed quantity $Y$ sold in the market is given in logs by

(5.1)  $\ln Y = \alpha_0(U) + \alpha_1(U)\ln P$, where $U$ is independent of $Z$,

where $P$ is the price picked by the market to equate supply and demand. That is, $P$ satisfies
$\alpha_0(U) + \alpha_1(U)\ln(P) = \ln f(P, Z, \mathcal{U})$, which implies that the observed price depends on the demand
disturbance $U$, i.e. that $P = \delta(Z, U, \mathcal{U})$ for some function $\delta$.
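
To see why the equilibrium price is endogenous, the following sketch simulates a market of this type. The demand side follows the random-coefficient form above; the log-linear supply function, every parameter value, and the variable names are assumptions chosen for illustration and are unrelated to the Fulton market data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

u = rng.uniform(size=n)                 # demand disturbance U ~ U(0, 1)
z = rng.binomial(1, 0.3, size=n)        # supply shifter (e.g. a stormy-weather dummy)
nu = rng.normal(scale=0.1, size=n)      # supply disturbance, independent of U and Z

alpha0 = 8.0 + 2.0 * u                  # assumed demand intercept alpha_0(U)
alpha1 = -1.5 + 0.8 * u                 # assumed demand elasticity alpha_1(U)

# Assumed supply: ln S = 8 + 1.0 * ln p - 0.5 * Z + nu.
# Equating demand and supply and solving for the log price:
#   alpha0 + alpha1 * ln P = 8 + ln P - 0.5 * Z + nu
ln_p = (alpha0 - 8.0 + 0.5 * z - nu) / (1.0 - alpha1)
ln_q = alpha0 + alpha1 * ln_p           # observed log quantity

corr_price_u = np.corrcoef(ln_p, u)[0, 1]   # price moves with the demand shock
corr_z_u = np.corrcoef(z, u)[0, 1]          # instrument is unrelated to it
naive_slope = np.polyfit(ln_p, ln_q, 1)[0]  # least-squares slope of ln Q on ln P
mean_elasticity = alpha1.mean()

print(corr_price_u, corr_z_u, naive_slope, mean_elasticity)
```

Because the price responds to $U$, the naive slope of log quantity on log price mixes demand and supply responses and need not resemble the demand elasticity, while $Z$ shifts the price without being related to $U$.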

As  instruments  Z,  we  consider  two  different  variables  capturing  weather  conditions  at  sea: 
Stormy  is  a  dummy  variable  which  indicates  wave  height  greater  than  4.5  feet  and  wind  speed 
greater  than  18  knots,  and  Mixed  is  a  dummy  variable  indicating  wave  height  greater  than  3.8 
feet and wind speed greater than 13 knots. These variables are plausible instruments since weather
conditions at sea should influence the amount of fish that reaches the market but should not influence
demand for the product.^ Simple OLS regressions of the log of price on these instruments suggest
they are correlated with price, yielding $R^2$ and F-statistics of 0.227 and 15.83 when both Stormy and
Mixed are used as instruments and 0.160 and 20.69 when just Stormy is used. However, given the
small sample, we may still expect identification to be weak, and weak identification is suggested by
the results reported below.
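
The link between the reported $R^2$ and F-statistics can be checked with the textbook formula for the overall F-test in an OLS regression with an intercept. The sketch below assumes the classical homoskedastic F-statistic, which the text does not state explicitly; the function names and the simulated data are illustrative.

```python
import numpy as np

def f_from_r2(r2, n, q):
    """Overall F-statistic implied by R^2 with n observations and q regressors
    (excluding the intercept), under the classical homoskedastic formula."""
    return (r2 / q) / ((1.0 - r2) / (n - q - 1))

def first_stage_stats(y, Z):
    """R^2 and overall F-statistic from an OLS regression of y on an intercept and Z."""
    n = y.shape[0]
    X = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    q = X.shape[1] - 1
    return r2, f_from_r2(r2, n, q)

# F-statistics implied by the rounded R^2 values quoted in the text (n = 111).
f_two_instruments = f_from_r2(0.227, 111, 2)   # roughly 15.9
f_one_instrument = f_from_r2(0.160, 111, 1)    # roughly 20.8

# Usage on simulated data (hypothetical, not the fish market data).
rng = np.random.default_rng(2)
stormy = rng.binomial(1, 0.3, size=111)
log_price = 0.3 * stormy + rng.normal(scale=0.35, size=111)
r2_sim, f_sim = first_stage_stats(log_price, stormy)
```

The implied values are close to the reported 15.83 and 20.69; the small gaps are consistent with rounding of $R^2$ to three decimals.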

Quantile regression (QR) and inverse quantile regression (IVQR) results for the .15, .25, .50,
.75, and .85 quantiles are reported in columns (1)-(3) of Table 1 below. Column (1) reports the
QR results, while columns (2) and (3) report IVQR results. Columns (2) and (3) differ in that only
Stormy is used as an instrument in Column (2), while Stormy and Mixed are used as instruments in
Column (3). For the $\tau$-th quantile, the row labeled $\hat\alpha(\tau)$ gives the QR or IVQR estimate of $\alpha$. The
row labeled "Wald Interval" contains the 95% confidence interval for $\hat\alpha(\tau)$ constructed based on the
asymptotic approximation, and for the IVQR estimates, the row labeled "Dual Interval" contains
the 95% confidence bound constructed using the dual inference procedure outlined in Section 3.1.
The computation of the IVQR estimator was conducted over the parameter space $\mathcal{A} = [-5, 5]$ using
grid points $\alpha_j$ equally spaced with a step size of 0.1.
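
The grid search can be sketched for the simplest case relevant to column (2): one endogenous regressor, an intercept, and a single binary instrument. In that saturated case the quantile regression of $Y - D\alpha$ on $(1, Z)$ amounts to comparing $\tau$-quantiles across the two instrument groups, so no QR solver is needed. The simulated data, parameter values, function name, and the use of the squared instrument coefficient in place of the weighted objective $W_n(\alpha)$ are assumptions made for illustration, and `np.quantile` interpolates between order statistics, so it only approximates the exact QR solution.

```python
import numpy as np

def ivqr_grid_binary(y, d, z, tau, grid):
    """Grid-search IVQR sketch for a scalar regressor d and a binary instrument z.

    For each candidate alpha, gamma_hat(alpha) is the difference between the
    tau-quantiles of y - alpha * d in the z == 1 and z == 0 groups; the estimate
    is the grid point where gamma_hat(alpha)**2 is smallest.
    """
    gamma_hat = np.empty(grid.shape[0])
    for j, a in enumerate(grid):
        r = y - a * d
        gamma_hat[j] = np.quantile(r[z == 1], tau) - np.quantile(r[z == 0], tau)
    j_star = int(np.argmin(gamma_hat ** 2))
    return grid[j_star], gamma_hat

# Simulated market (all numbers are assumptions): the true elasticity at
# quantile index u is -1.5 + 0.8 * u, i.e. -1.1 at the median.
rng = np.random.default_rng(3)
n = 20000
u = rng.uniform(size=n)
z = rng.integers(0, 2, size=n)
nu = rng.normal(scale=0.1, size=n)
d = (2.0 * u + 0.5 * z - nu) / (2.5 - 0.8 * u)   # equilibrium log price
y = 8.0 + 2.0 * u + (-1.5 + 0.8 * u) * d         # log quantity demanded

grid = np.linspace(-5.0, 5.0, 101)               # step size 0.1 on [-5, 5]
alpha_hat, gamma_hat = ivqr_grid_binary(y, d, z, tau=0.5, grid=grid)
print(alpha_hat)
```

With covariates or non-binary instruments, the inner step would instead be an ordinary quantile regression of $Y - D\alpha$ on $X$ and $Z$ at each grid point.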

The  IVQR  estimates  exhibit  considerable  heterogeneity,  ranging  from  -1.5  to  -0.7  in  column  (2) 
and  from  -1.5  to  -0.9  in  column  (3).  The  IVQR  elasticities  are  also  uniformly  greater  in  magnitude 
than  the  "price  effects"  estimated  by  the  ordinary  QR,  which  we  might  anticipate  given  endogenous 
sampling resulting from the joint determination of price and quantity in the market. These differences
are illustrated graphically in Figure 1. The left panel reports the QR results, and the right
panel  reports  the  IVQR  results  when  both  Stormy  and  Mixed  are  used  as  instruments.  In  the  figure, 
we  clearly  see  that  the  demand  functions  estimated  by  IVQR  are  steeper  than  those  estimated  by  QR 
when  plotted  in  log-price-log-quantity  space,  and  that  this  translates  directly  into  more  curvature 
of  the  demand  curve  when  plotted  in  the  original  price-quantity  space.  It  is  also  important  to  note 
that  the  interpretation  of  IVQR  and  QR  estimates  is  very  different.  IVQR  estimates  a  structural 
demand  model,  while  QR  estimates  the  conditional  quantiles  of  the  equilibrium  quantity  variable  as 
a  function  of  the  equilibrium  price.  It  is  no  surprise  that  these  estimates  are  different. 

Interestingly,  there  are  clear  and  large  differences  between  the  confidence  intervals  given  by  the 
two  different  inference  procedures  for  IVQR.  In  particular,  the  confidence  intervals  based  on  the 
dual  procedure  which  is  robust  to  weak  and  partial  identification  are  uniformly  much  wider  than 
the  intervals  based  on  the  direct  inference  procedure.  For  instance,  the  dual  confidence  region  for 
the  .85  quantile  case  contains  the  upper  endpoint  of  the  parameter  space  A.  In  addition,  it  is  worth 


^More  detailed  arguments  may  be  found  in  Graddy  (1995). 
Table 1. Results from Empirical Examples

                    Example 1: Demand for Fish                        Example 2: Returns to Training
                    QR              IVQR             IVQR             QR            IVQR
                    (1)             (2)              (3)              (4)           (5)

$\hat\alpha(.15)$   -0.53           -1.5             -1.5             1188          -200
Wald CI             (-1.24,0.16)    (-3.69,-0.69)    (-2.51,-0.49)    (553,1822)    (-1435,1035)
Dual CI                             [-5.0,0.5)       (-3.2,0.1)                     (-1300,1500)

$\hat\alpha(.25)$   -0.40           -1.0             -1.4             2510          500
Wald CI             (-0.87,0.07)    (-2.51,0.51)     (-2.52,-0.28)    (1742,3278)   (-887,1887)
Dual CI                             (-4.4,0.0)       (-3.1,0.1)                     (-1000,2000)

$\hat\alpha(.50)$   -0.41           -0.7             -0.9             4420          300
Wald CI             (-0.81,-0.01)   (-1.67,0.27)     (-1.82,0.02)     (3220,5621)   (-1589,2189)
Dual CI                             (-3.0,0.6)       (-3.0,0.6)                     (-1400,2700)

$\hat\alpha(.75)$   -0.70           -1.2             -1.3             4678          2700
Wald CI             (-1.18,-0.22)   (-2.02,-0.38)    (-2.07,-0.53)    (2901,6455)   (-260,5660)
Dual CI                             (-2.0,-0.1)      (-2.1,0.1)                     (-400,5600)

$\hat\alpha(.85)$   -0.81           -1.3             -1.1             4806          3200
Wald CI             (-1.24,-0.38)   (-2.10,-0.50)    (-1.82,-0.38)    (2751,6861)   (32,6368)
Dual CI                             (-2.0,5.0]       (-2.6,5.0]                     (500,5800)

Notes: Columns (1)-(3) report results from estimation of the demand for fish, and columns (4) and (5)
report results from estimation of the returns to training from the JTPA experiment. Columns (1) and (4)
report conventional quantile regression results, and columns (2), (3), and (5) report instrumental quantile
regression results. In column (2), one instrument, Stormy, is used, and in column (3), two instruments,
Stormy and Mixed, are used. Rows labeled $\hat\alpha(\tau)$ for $\tau \in \{.15, .25, .50, .75, .85\}$ report point estimates, and
the numbers in parentheses are confidence intervals.


noting  that  the  confidence  intervals  obtained  using  two  instrumental  variables  are  generally  shorter 
than  the  confidence  intervals  obtained  using  just  one  instrumental  variable,  suggesting  an  efficiency 
gain  to  using  more  instruments. 

The construction and nature of the dual confidence bounds are further illustrated in Figures 2
and 3, which respectively plot the IVQR objective function $W_n(\alpha)$ over the parameter space in the
two cases. $\alpha$ is plotted on the horizontal axis, and the vertical axis shows $W_n(\alpha)$. The horizontal
line in each graph is the 95% critical value for the dual testing procedure, so all points lying below
the horizontal line belong to the confidence region for $\alpha(\tau)$.

These graphs display a number of interesting features. It is apparent that the objective function,
while having numerous local minima, has a distinct minimum over $\mathcal{A}$ in all cases. The objective
functions, and hence dual confidence regions, are generally well-behaved in the middle of the
distribution and become more erratic as one moves toward the tails of the distribution. It is also clear
that the dual confidence regions are not connected in many cases.
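
The way a dual confidence region is read off the objective function, and why it can be disconnected, can be sketched as follows. Given the grid and the values of the statistic on it, the region is the set of grid points where the statistic does not exceed the chi-square critical value; the helper then groups adjacent grid points into intervals. The function name and the toy objective values are hypothetical; the critical values are the standard 95% quantiles of the chi-square distribution with one and two degrees of freedom.

```python
import numpy as np

# 95% critical values of the chi-square distribution (1 and 2 degrees of freedom).
CHI2_95 = {1: 3.841, 2: 5.991}

def dual_region(grid, w_values, df):
    """Return the dual confidence region as a list of (low, high) grid intervals."""
    inside = w_values <= CHI2_95[df]
    intervals = []
    start = None
    end = None
    for a, ok in zip(grid, inside):
        if ok:
            if start is None:
                start = float(a)      # a new interval opens here
            end = float(a)            # extend the current interval
        elif start is not None:
            intervals.append((start, end))
            start = None
    if start is not None:             # region reaches the edge of the grid
        intervals.append((start, end))
    return intervals

# Toy objective with two separate dips below the critical value.
toy_grid = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
toy_w = np.array([10.0, 1.0, 5.0, 2.0, 3.0, 9.0])
region = dual_region(toy_grid, toy_w, df=1)
print(region)
```

An interval whose endpoint coincides with the edge of the grid, as in the .85 quantile rows of Table 1, signals that the region may extend beyond the parameter space considered.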

This simple example clearly illustrates the potential differences between the direct and dual
inference procedures. It also provides an example of an application of the methods of this paper
to demand analysis where the elasticity of demand is potentially heterogeneous. The next example
illustrates the use of the estimator in a setting with a considerably larger sample and where
identification is much stronger, showing that in this setting the two inference procedures produce similar
results. The results also demonstrate the interesting insights that may be gained through quantile
analysis and the importance of accounting for endogeneity in such studies.

5.2.  The  Returns  to  Training 

The  impact  of  job  training  programs  on  the  earnings  of  trainees,  especially  those  with  low  income,  is 
of  great  interest  to  both  policy  makers  and  academic  economists,  but  evaluating  the  causal  effect  of 
training programs on earnings is difficult due to the self-selection of treatment status. However, data
available from a randomized training experiment conducted under the Job Training Partnership Act
(JTPA) provide a mechanism for addressing this issue. In the experiment, people were randomly
assigned  the  offer  of  JTPA  training  services,  but  because  people  were  able  to  refuse  to  participate,  the 
actual  treatment  receipt  was  self-selected.  Of  those  offered  treatment,  only  60  percent  participated 
in  the  training.  There  was  also  a  small  number  of  individuals  from  the  control  group  who  received 
training.  The  random  assignment  of  the  training  offer  provides  a  plausible  instrument  for  a  person's 
actual  training  status.  Abadie,  Angrist,  and  Imbens  (2002),  who  previously  used  this  data  to 
examine  the  impact  of  job  training  on  earnings,^  provide  more  detailed  information  regarding  data 
collection  procedures,  sample  selection  criteria,  and  institutional  details  of  the  JTPA  along  with 
additional  facts  and  discussion  about  the  JTPA  training  experiment.  In  this  example,  we  limit  the 
analysis  to  the  sample  of  adult  males. 

To  capture  the  effects  of  training  on  earnings,  we  estimate  a  structural  quantile  model  of  the 
form 

$Y = D\alpha(U) + X'\beta(U)$,  $U \sim U(0,1)$ given $Z$ and $X$,

where  D  indicates  training  status  and  is  instrumented  for  by  assignment  to  the  treatment  group,  the 
outcomes  Y  are  earnings,  X  is  a  vector  of  covariates,  Z  is  a  dummy  variable  indicating  assignment 
to  the  treatment  group,  and  U  is  an  unobservable  affecting  earnings. 


They  use  a  different  modeling  framework  and  estimator  that  estimates  the  treatment  effect  for  the 
sub-population  of  "compliers". 
The data consist of 5,102 observations with data on earnings, training and assignment status,
and other individual characteristics. Earnings are measured as total earnings over the 30 month
period following the assignment into the treatment or control group, and average earnings in the
sample are $19,147. The vector of controls, $X$, includes dummies for black and Hispanic persons, a
dummy indicating high-school graduates and GED holders, five age-group dummies, a marital status
dummy, a dummy indicating whether the applicant worked 12 or more weeks in the 12 months prior
to the assignment, a dummy signifying that earnings data are from a second follow-up survey, and
dummies for the recommended service strategy.^ For brevity, we only report results for estimates of
the key parameter, $\alpha(\tau)$, which represents the impact of the training program on earnings.

Since assignment to the treatment or control group was random, it provides a natural instrument
$Z$. The instrument is useful for identification since it is highly correlated with the actual training
state $D$. The partial $R^2$ of a regression of training status on assignment to the treatment group,
controlling for the other covariates, is .609, and the first-stage F-statistic is 2,673. This strong
correlation suggests that weak identification should not be a problem in this case, so the direct and
dual inference procedures should yield similar results.

As in the previous example, estimation results for the .15, .25, .50, .75, and .85 quantiles are
reported in columns (4)-(5) of Table 1. Column (4) reports the QR results, while column (5) reports
the IVQR results. For the $\tau$-th quantile, the row labeled $\hat\alpha(\tau)$ gives the QR or IVQR estimate of $\alpha$.
The row labeled "Wald Interval" contains the 95% confidence interval for $\hat\alpha(\tau)$ constructed based on
the asymptotic approximation, and for the IVQR estimates, the row labeled "Dual Interval" contains
the 95% confidence bound constructed using the dual inference procedure outlined in Section 3.1.
The computation of IVQR was conducted over the parameter space $\mathcal{A} = [-2500, 7500]$ using $\alpha_j$
equally spaced with a step size of 100. In addition, the results are presented graphically in Figure 4.

The  conventional  QR  results,  which  fail  to  account  for  the  selection  into  the  treatment  state, 
are  uniformly  positive  and  significantly  different  from  0.  They  indicate  that  the  training  program 
had  a  relatively  large  impact  on  the  earnings  of  participants,  and  that  this  impact  is  increasing  in 
the  quantile  index.  However,  given  that  people  were  able  to  decide  whether  or  not  to  participate 
in  training  following  the  initial  random  assignment,  it  seems  likely  that  these  estimates  would  be 
upward  biased  for  the  actual  effect  of  training  on  earnings.  This  suspicion  is  confirmed  by  the  IVQR 
estimates,  which  account  for  the  endogeneity  of  training  status  and  are  uniformly  smaller  than 
the  corresponding  QR  estimates.  This  difference  is  most  apparent  in  the  low  and  middle  earning 
quantiles.  In  the  low  quantiles,  QR  suggests  a  moderate  positive  and  significant  effect  of  training  on 
earning  quantiles;  however,  the  IVQR  estimates  are  quite  low  and,  while  imprecise,  not  significantly 


The  recommended  service  strategy  was  broken  into  three  categories;    classroom  training,  on-the-job 
training  and/or  job  search  assistance,  and  other  forms  of  training. 
different from 0. The difference in the estimates becomes more apparent when we consider the
percentage impact of the training program, which is presented in the right hand column of Figure
4.* Here, the QR estimates imply a large percentage increase in earnings in the low earning quantiles,
starting at 139% for $\tau = .15$, which declines as one moves to the upper quantiles of the conditional
earnings distribution, though the impact remains large even in the center of the distribution at
$\tau = .50$, where the implied effect is a 35% increase in earnings due to training. The IVQR estimates
on the other hand are quite stable, varying between -13% and 14%, and with the exception of
$\tau = .25$ are all below 10%.

Unlike  the  case  considered  above,  we  do  not  find  large  differences  between  the  direct  and  dual 
inference  procedures  for  IVQR  in  this  case.  The  similarity  between  the  two  approaches  is  not 
unexpected  due  to  the  strong  correlation  between  the  instrument  and  endogenous  regressor.  The 
close  agreement  here  further  suggests  that  not  much  is  lost  by  considering  the  dual  procedure  in 
cases  where  identification  is  strong.  It  also  provides  further  support  for  the  argument  that  the 
differences  detected  in  the  previous  section  are  due  to  weak  identification.  Given  the  robustness  of 
the  dual  procedure  to  the  presence  of  weak  instruments  and  its  simple  computation,  it  seems  that 
this  inference  procedure  will  be  preferable  to  the  standard  procedure  in  many  cases. 

The dual confidence bounds are further illustrated in Figure 5, which plots the IVQR objective
function $W_n(\alpha)$ over the parameter space $\mathcal{A}$. $\alpha$ is plotted on the horizontal axis, and the vertical
axis shows $W_n(\alpha)$. The horizontal line in each graph is the 95% critical value for the dual testing
procedure, so all points lying below the horizontal line belong to the confidence region for $\alpha(\tau)$. The
graphs in Figure 5 differ markedly from those in Figures 2 and 3. In particular, all of the objective
functions, and hence confidence regions, in Figure 5 look remarkably well-behaved. The objective
functions appear to be reasonably smooth, and the confidence intervals are all connected and clearly
bounded within the parameter space considered.

6.  Conclusion 

In this paper, we propose an estimation approach, the inverse quantile regression (IVQR), that
appropriately modifies the conventional quantile regression (QR) and recovers quantile-specific
covariate effects in an instrumental variables model defined by $Y = D'\alpha(U)$ where $U$ is independent
of a set of instrumental variables $Z$. The IVQR estimator is appealing for estimation in this model
since it can be computed through a series of conventional quantile regression steps and so will be
computationally convenient in many cases encountered in practice. We derive the asymptotic
properties of the estimator under suitable conditions. In addition, we demonstrate that the estimation


^The  percentage  impact  is  for  changing  from  D  =  0  to  D  =  1,  i.e.  from  the  non-training  to  the  training 
state.  All  other  covariates  were  evaluated  at  their  sample  means. 
procedure leads to a testing procedure which will be robust to the presence of weak instruments
and  that  this  inference  procedure  results  naturally  from  the  IVQR  algorithm  and  so  is  simple  to 
implement  in  practice. 

We  then  illustrate  the  use  of  the  proposed  estimator  and  testing  procedure  through  two  brief 
empirical  examples.  In  the  first  example,  we  examine  a  simple  demand  model  in  a  small  sample  with 
relatively  weak  instruments.  In  this  case,  we  find  that  the  conventional  QR  estimate  of  the  elasticity 
of  demand  appears  to  be  upward  biased  as  would  be  expected  due  to  the  joint  determination  of 
price  and  quantity  by  supply  and  demand.  In  addition,  we  find  that  there  are  large  differences 
between  the  direct  inference  procedure  and  the  dual  inference  procedure  which  is  robust  to  weak 
instruments. In the second example, we look at the impact of a job training program on earnings,
instrumenting for training status with random assignment to the training program, which
is very highly correlated with actual receipt of training. In this case, there is essentially no difference
between the direct inference procedure and the dual procedure, which is robust to weak instruments.
In  addition,  there  is  strong  evidence  of  endogeneity  of  training  status  resulting  in  substantial  bias  to 
the  conventional  QR  estimator.  This  bias  is  especially  pronounced  in  the  lower  tail  of  the  earnings 
distribution  where  QR  suggests  a  significant  and  positive  effect  of  training  on  earnings,  while  the 
IVQR  estimates  are  insignificant  and  small  in  magnitude. 
7.  Appendix

7.1.  Proof  of  Proposition  1 

The result (4.3) follows by adopting standard arguments for quantile regression processes, for
instance those of Gutenbrunner and Jureckova (1992). The rest of the stated conclusions (4.4)-(4.10)
follow by the Slutsky Lemma.

7.2.  Proof  of  Proposition  2 

We use standard definitions and notation from empirical process theory, as e.g. in van der Vaart and
Wellner (1996) and van der Vaart (1998). For $W := (Y, D, X, Z)$, define the maps

(7.1)  $f \mapsto E_n[f(W)] := \frac{1}{n}\sum_{i=1}^{n} f(W_i)$,    $f \mapsto G_n[f(W)] := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big(f(W_i) - E[f(W_i)]\big)$,

where we use $E$ to denote the usual expectation and $\mathbb{E}$ to denote expectation evaluated at an
estimated function $\hat f$: $\mathbb{E}[\hat f(W_i)] := (E[f(W_i)])_{f=\hat f}$.

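For convenience we collect important definitions below. Let $\vartheta := (\beta, \gamma)$ and $\vartheta(\tau) := (\beta(\tau), 0)$.
Define

(7.2)  $f(W,\alpha,\vartheta) := \{\tau - 1(Y < D'\alpha + X'\beta + Z'\gamma)\}\Psi$,

where $\Psi := V \cdot (X', Z')'$. Define for $\rho_\tau(u) = (\tau - 1(u < 0))u$

(7.3)  $g(W,\alpha,\vartheta) := \rho_\tau(Y - D'\alpha - X'\beta - Z'\gamma)V$.

Define

(7.4)  $Q_n(\alpha,\vartheta) := E_n[g(W,\alpha,\vartheta)]$,    $Q(\alpha,\vartheta) := E[g(W,\alpha,\vartheta)]$,

and

(7.5)  $\hat\vartheta(\alpha,\tau) := (\hat\beta(\alpha,\tau), \hat\gamma(\alpha,\tau)) := \arg\inf_{\vartheta \in \mathbb{R}^{\dim(\vartheta)}} Q_n(\alpha,\vartheta)$,

(7.6)  $\vartheta(\alpha,\tau) := (\beta(\alpha,\tau), \gamma(\alpha,\tau)) := \arg\inf_{\vartheta \in \mathbb{R}^{\dim(\vartheta)}} Q(\alpha,\vartheta)$,

(7.7)  $W_n[\alpha] := \hat\gamma(\alpha,\tau)'A(\alpha)\hat\gamma(\alpha,\tau)$,    $W[\alpha] := \gamma(\alpha,\tau)'A(\alpha)\gamma(\alpha,\tau)$,

(7.8)  $\hat\alpha(\tau) := \arg\inf_{\alpha \in \mathcal{A}} W_n[\alpha]$,    $\alpha^* := \arg\inf_{\alpha \in \mathcal{A}} W[\alpha]$,

(7.9)  $\hat\beta(\tau) := \hat\beta(\hat\alpha(\tau),\tau)$,    $\beta^* := \beta(\alpha^*,\tau)$,    $\hat\gamma(\tau) := \hat\gamma(\hat\alpha(\tau),\tau)$,    $\gamma^* := \gamma(\alpha^*,\tau)$.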
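
The role of the check function $\rho_\tau(u) = (\tau - 1(u < 0))u$ in (7.3) can be illustrated with a small numerical sketch: minimizing the average check loss over a constant picks out a sample $\tau$-quantile. The data values and names below are hypothetical. Since the average loss is convex and piecewise linear with kinks at the data points, searching over the observed values is sufficient.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = (tau - 1(u < 0)) * u, applied elementwise."""
    return (tau - (u < 0)) * u

y = np.array([3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0])   # toy sample, n = 7
tau = 0.25

# Evaluate the average check loss at each observed value and keep the minimizer.
candidates = np.sort(y)
losses = np.array([check_loss(y - c, tau).mean() for c in candidates])
best = candidates[int(np.argmin(losses))]
print(best)   # the second-smallest observation, 1.5
```

Here $n\tau = 1.75$ is not an integer, so the minimizer is unique and equals the second order statistic.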

Step 1. (Identification) We show that $\theta(\tau) = (\alpha(\tau)', \beta(\tau)')'$ uniquely solves the limit problem.

First, by R6, the mapping $(\alpha,\beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is one-to-one over $\mathcal{A} \times \mathcal{B}$. By
equation (2.5), we have that $\theta(\tau) = (\alpha(\tau)', \beta(\tau)')'$ solves the equation $E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi] =
0$, and it is thus the only solution over $\mathcal{A} \times \mathcal{B}$.

Second, we need to show that R5'-R6' and R5*-R6* suffice for R6. Sufficiency of R5'-R6'
follows by a variant of the Hadamard-Caccioppoli theorem for general metric spaces, cf. Theorem 1.8 in
Ambrosetti and Prodi (1995). Sufficiency of R5*-R6* follows by Theorem 2 in Mas-Colell (1979).

Third, we have that the true parameters $(\alpha,\beta) = (\alpha(\tau),\beta(\tau))$ uniquely solve the equation

(7.10)  $E[\{\tau - 1(Y < D'\alpha + X'\beta + Z'0)\}\Psi] = 0$

over $\mathcal{A} \times \mathcal{B}$. By R4 and by convexity in $\vartheta$ of the limit optimization problem for each $\alpha$, $\vartheta(\alpha,\tau)$
uniquely solves the equation:

(7.11)  $E[\{\tau - 1(Y < D'\alpha + X'\beta(\alpha,\tau) + Z'\gamma(\alpha,\tau))\}\Psi] = 0$.

By construction of $\mathcal{A} \times \mathcal{B}$ we know that $\beta(\alpha,\tau)$ is in the interior of $\mathcal{B}$ for each $\alpha \in \mathcal{A}$. We need to
find $\alpha^* \in \mathcal{A}$ such that this equation holds and the norm of $\gamma(\alpha^*,\tau)$ is minimal. $\alpha^* = \alpha(\tau)$ makes
$\gamma(\alpha^*,\tau) = 0$ by equation (2.5). Thus $\alpha^* = \alpha(\tau)$ is a solution; by the preceding argument it is unique
and $\beta(\alpha^*,\tau) = \beta(\tau)$.

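Step 2. (Consistency) One consequence of Proposition 1, namely of equation (4.3), is that

(7.12)  $\sup_{\alpha \in \mathcal{A}} \|\hat\vartheta(\alpha,\tau) - \vartheta(\alpha,\tau)\| \to_p 0$, so that, in particular, $\sup_{\alpha \in \mathcal{A}} \|\hat\gamma(\alpha,\tau) - \gamma(\alpha,\tau)\| \to_p 0$,

which implies $\sup_{\alpha \in \mathcal{A}} |W_n(\alpha) - W(\alpha)| \to_p 0$, where $W(\alpha)$ is continuous in $\alpha$ over $\mathcal{A}$. It therefore
follows by the standard consistency argument for extremum estimators that $\hat\alpha(\tau) \to_p \alpha(\tau)$, and then
by (7.12) that for any $\alpha_n \to_p \alpha(\tau)$, $\hat\beta(\alpha_n,\tau) \to_p \beta(\alpha(\tau),\tau) = \beta(\tau)$ and $\hat\gamma(\alpha_n,\tau) \to_p \gamma(\alpha(\tau),\tau) =
\gamma(\tau) = 0$. Hence we also have that

(7.13)  $\hat\vartheta(\alpha_n,\tau) \to_p \vartheta(\alpha(\tau),\tau)$ for any $\alpha_n \to_p \alpha(\tau)$.

Note that above we have used that $\vartheta(\alpha,\tau)$ is continuous in $\alpha$, which is verified by the implicit
function theorem applied to equation (7.11).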

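Step 3. (Asymptotics) Let $\alpha_n$ be in a small ball centered at $\alpha(\tau)$. By the computational
properties of the quantile regression estimator $\hat\vartheta(\alpha_n,\tau)$ established in Theorem 3.3 in Koenker and
Bassett (1978),

(7.14)  $O(1/\sqrt{n}) = \sqrt{n}\,E_n[f(W,\alpha_n,\hat\vartheta(\alpha_n,\tau))]$.

The functional class $\{f(W,\alpha,\vartheta), (\alpha,\vartheta) \in \mathcal{A} \times \mathcal{B} \times \mathcal{G}\}$ is Donsker for any compact sets $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{G}$,
because this class is a product of a VC subgraph class and a bounded random vector. Hence the
following expansion of the rhs of (7.14) is valid for any $\alpha_n \to_p \alpha(\tau)$:

(7.15)  $O(1/\sqrt{n}) = G_n[f(W,\alpha_n,\hat\vartheta(\alpha_n,\tau))] + \sqrt{n}\,\mathbb{E}[f(W,\alpha_n,\hat\vartheta(\alpha_n,\tau))]$

        $= G_n[f(W,\alpha(\tau),\vartheta(\alpha(\tau),\tau))] + \sqrt{n}\,\mathbb{E}[f(W,\alpha_n,\hat\vartheta(\alpha_n,\tau))] + o_p(1)$.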
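Expanding the very last element further, by R4

(7.16)  $O(1/\sqrt{n}) = G_n[f(W,\alpha(\tau),\vartheta(\tau))] + o_p(1)$
        $+ (J_\vartheta + o_p(1))\sqrt{n}(\hat\vartheta(\alpha_n,\tau) - \vartheta(\tau)) + (J_\alpha + o_p(1))\sqrt{n}(\alpha_n - \alpha(\tau))$,

where by R1 and R3

(7.17)  $J_\vartheta = \frac{\partial}{\partial(\beta',\gamma')} E[1(Y < D'\alpha(\tau) + X'\beta + Z'\gamma)\Psi]\big|_{(\gamma,\beta)=(0,\beta(\tau))} = E[f_\epsilon(0|X,Z,D)\Psi[X',Z']]$,
        $J_\alpha = \frac{\partial}{\partial\alpha'} E[1(Y < D'\alpha + X'\beta(\tau))\Psi]\big|_{\alpha=\alpha(\tau)} = E[f_\epsilon(0|X,Z,D)\Psi D']$.

In other words, for any $\alpha_n \to_p \alpha(\tau)$

(7.18)  $\sqrt{n}(\hat\vartheta(\alpha_n,\tau) - \vartheta(\tau)) = -J_\vartheta^{-1}G_n[f(W,\alpha(\tau),\vartheta(\tau))]$
        $- J_\vartheta^{-1}J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1)$,

so

(7.19)  $\sqrt{n}(\hat\beta(\alpha_n,\tau) - \beta(\tau)) = -\bar J_\beta G_n[f(W,\alpha(\tau),\vartheta(\tau))]$
        $- \bar J_\beta J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1)$,

and

(7.20)  $\sqrt{n}(\hat\gamma(\alpha_n,\tau) - 0) = -\bar J_\gamma G_n[f(W,\alpha(\tau),\vartheta(\tau))]$
        $- \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1)$,

where $[\bar J_\beta', \bar J_\gamma']'$ is the conformable partition of $J_\vartheta^{-1}$.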

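Center a shrinking closed ball $B_n$ at 0, so that by consistency obtained in Step 2, $\hat\alpha(\tau) - \alpha(\tau) \in B_n$
wp $\to 1$. Then wp $\to 1$

(7.21)  $\hat\alpha(\tau) = \arg\inf_{\alpha_n:\ \alpha_n - \alpha(\tau) \in B_n} W_n(\alpha_n)$.

Note that $G_n[f(W,\alpha(\tau),\vartheta(\tau))] \to_d N(0, S)$ by the Central Limit Theorem. Hence $G_n[f(W,\alpha(\tau),\vartheta(\tau))] =
O_p(1)$, and, scaling by $n$, which leaves the minimizer unchanged,

(7.22)  $n W_n(\alpha_n) = [O_p(1) - \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau))]'$
        $\times [A(\alpha(\tau)) + o_p(1)]$
        $\times [O_p(1) - \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau))]$.

It then follows from (7.21) and (7.22) that $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = O_p(1)$ since $\bar J_\gamma J_\alpha$ has full column
rank and $A(\alpha)$ has full rank from R4 and R5. Thus, we have that

(7.23)  $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = \arg\inf_{z \in \mathbb{R}^{\dim(\alpha)}} [Q_n(z) + o_p(1)]$,

where $Q_n(z) := (-\bar J_\gamma G_n f(W,\alpha(\tau),\vartheta(\tau)) - \bar J_\gamma J_\alpha z)' A(\alpha(\tau)) (-\bar J_\gamma G_n f(W,\alpha(\tau),\vartheta(\tau)) - \bar J_\gamma J_\alpha z)$.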
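LEMMA 1 (Approximate Argmins, Knight (1999)). Define $Z_n$ such that $Q_n(Z_n) \le \inf_{z \in \mathbb{R}^d} Q_n(z) +
\epsilon_n$, $\epsilon_n \searrow 0$, and define $Z_n^*$ as $\arg\inf_{z \in \mathbb{R}^d} Q_n(z)$. Suppose that $Z_n = O_p(1)$, $Z_n^* = O_p(1)$, $Z_\infty :=
\arg\min_{z \in \mathbb{R}^d} Q_\infty(z)$ is uniquely defined in $\mathbb{R}^d$ a.s., and $Q_n(\cdot) \Rightarrow Q_\infty(\cdot)$ in $\ell^\infty(K)$ over any compact
sets $K$, where $Q_\infty$ is continuous. Then $Z_n = Z_n^* + o_p(1)$ and $Z_n \to_d Z_\infty$.

Apply Lemma 1 to $Q_n(z)$ defined above and conclude that

(7.24)  $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = \arg\inf_{z \in \mathbb{R}^{\dim(\alpha)}} [Q_n(z)] + o_p(1)$,

that is

(7.25)  $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = -(J_\alpha' \bar J_\gamma' A(\alpha(\tau)) \bar J_\gamma J_\alpha)^{-1} (J_\alpha' \bar J_\gamma' A(\alpha(\tau)) \bar J_\gamma)$
        $\times\, G_n[f(W,\alpha(\tau),\vartheta(\tau))] + o_p(1)$.

Hence

(7.26)  $\sqrt{n}(\hat\vartheta(\hat\alpha(\tau),\tau) - \vartheta(\tau)) = -J_\vartheta^{-1}[I - J_\alpha(J_\alpha' \bar J_\gamma' A(\alpha(\tau)) \bar J_\gamma J_\alpha)^{-1} J_\alpha' \bar J_\gamma' A(\alpha(\tau)) \bar J_\gamma]$
        $\times\, G_n[f(W,\alpha(\tau),\vartheta(\tau))] + o_p(1)$.

The conclusion of Proposition 2 follows from $G_n[f(W,\alpha(\tau),\vartheta(\tau))] \to_d N(0, S)$.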
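
As a worked check on how the limit variance of $\hat\alpha(\tau)$ follows from (7.25), the short derivation below writes out the sandwich form and its collapse under the weighting matrix discussed in Corollary 2. It assumes $A$ is symmetric and uses $\bar J_\gamma$ for the $\gamma$-block of $J_\vartheta^{-1}$; the shorthand $M$ and $\Sigma$ is introduced here only for this illustration.

```latex
\begin{aligned}
M &:= \bar J_\gamma J_\alpha, \qquad \Sigma := \bar J_\gamma S \bar J_\gamma', \\[4pt]
\sqrt{n}\,(\hat\alpha(\tau)-\alpha(\tau))
  &= -(M'AM)^{-1} M'A\,\bar J_\gamma\, G_n[f(W,\alpha(\tau),\vartheta(\tau))] + o_p(1), \\[4pt]
\operatorname{Avar}\big(\sqrt{n}\,(\hat\alpha(\tau)-\alpha(\tau))\big)
  &= (M'AM)^{-1}\, M'A\,\Sigma\,A M\,(M'AM)^{-1}, \\[4pt]
A = \Sigma^{-1} \;\Longrightarrow\;
\operatorname{Avar}\big(\sqrt{n}\,(\hat\alpha(\tau)-\alpha(\tau))\big)
  &= (M'\Sigma^{-1}M)^{-1}
   = \big(J_\alpha' \bar J_\gamma' (\bar J_\gamma S \bar J_\gamma')^{-1} \bar J_\gamma J_\alpha\big)^{-1}.
\end{aligned}
```

When $\dim(\gamma) = \dim(\alpha)$ the matrix $M$ is square and invertible, so $A$ cancels from the sandwich, which is the invariance to the weighting matrix noted in Corollary 1.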
Figure 1. Estimates of Effect of Price on Quantity by QR and IVQR

Note: Left Column: The estimated conditional quantile curves of the quantity of fish sold
as a function of price for $\tau$ = .15, .25, .50, .75, and .85. The top display is in log-price
log-quantity space with log-price on the horizontal axis and log-quantity on the vertical
axis. The bottom display is in price-quantity space with price on the horizontal axis and
quantity on the vertical axis. Right Column: The demand curves estimated by IVQR for
$\tau$ = .15, .25, .50, .75, and .85. The top display is in log-price log-quantity space with
log-price on the horizontal axis and log-quantity on the vertical axis. The bottom display
is in price-quantity space with price on the horizontal axis and quantity on the vertical
axis.
Figure 2. Statistic $W_n(\alpha)$ in Demand Example Using Stormy as an Instrument

[Figure panels omitted; a surviving panel label reads $\tau$ = .50.]


Note: Objective functions and dual confidence regions for the demand for fish example. All
models are as specified in the main text. The estimates make use of one instrument,
Stormy. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the vertical axis. The horizontal line
is the 95% critical value from a $\chi^2_1$ distribution. The dual confidence region is all values of $\alpha$ such
that $W_n(\alpha)$ lies below the horizontal line.
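
The dual confidence region described in this note can be read off a grid of $\alpha$ values mechanically. The sketch below assumes a single instrument, so the critical value is that of a $\chi^2_1$ distribution, computed here as the square of a standard normal quantile. The grid and the values of $W_n(\alpha)$ are hypothetical placeholders, not the statistics plotted in Figure 2, and the function names are illustrative.

```python
from statistics import NormalDist


def chi2_1_critical_value(level=0.95):
    """Critical value of a chi-square(1): squared standard normal quantile at (1 + level) / 2."""
    z = NormalDist().inv_cdf((1.0 + level) / 2.0)
    return z * z


def dual_confidence_region(alpha_grid, w_values, level=0.95):
    """Return the grid points alpha whose statistic W_n(alpha) lies below the critical value."""
    c = chi2_1_critical_value(level)
    return [a for a, w in zip(alpha_grid, w_values) if w < c]


# Hypothetical grid of alpha values and statistic values, for illustration only.
alpha_grid = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5]
w_values = [9.1, 3.2, 0.4, 1.7, 5.6, 12.3]

region = dual_confidence_region(alpha_grid, w_values)
print(region)
```

With more than one degree of freedom the shortcut through the normal quantile no longer applies, and the critical value would have to come from a chi-square quantile routine instead.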
Figure 3. Statistic $W_n(\alpha)$ in Demand Example Using Stormy and Mixed as Instruments

Note: Objective functions and dual confidence regions for the demand for fish example. All
models are as specified in the main text. The estimates make use of two instruments,
Stormy and Mixed. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the vertical axis. The
horizontal line is the 95% critical value from a $\chi^2$ distribution. The dual confidence region is all
values of $\alpha$ such that $W_n(\alpha)$ lies below the horizontal line.
Figure 4. Estimates of the Training Impact by QR and by IVQR

[Figure panels omitted; surviving panel titles: "IVQR: Training Effect" and "IVQR: Percentage Impact of Training".]


Note: Left Column: QR and IVQR estimates of the impact of a job training program on
earnings for $\tau$ = .15, .25, .50, .75, and .85. The top panel reports the QR estimate of the
training impact, and the bottom panel reports the IVQR results. In each figure, the solid
line represents the point estimates, and the dashed (- -) line represents the 95%
confidence interval formed using the direct inference approach. For the IVQR results, the
dash-dot (-.) line represents the 95% confidence bound constructed using the dual
inference procedure described in the text. In both figures, the horizontal axis measures
the quantile index $\tau$, and the vertical axis is the impact of training on earnings quantiles
measured in dollars. Models include covariates as specified in the text, and the sample
size is 5,102. Right Column: QR and IVQR estimates of the percentage impact of
training for $\tau$ = .15, .25, .50, .75, and .85. The top panel reports the QR estimate of the
training impact, and the bottom panel reports the IVQR results. Percentage impacts are
for moving from non-training to training, and all other covariates are evaluated at their
sample mean. In both figures, the horizontal axis measures the quantile index $\tau$, and the
vertical axis is the percentage impact of training.
Figure 5. Statistic $W_n(\alpha)$ in the Training Example.

[Figure panels omitted; surviving panel labels: $\tau$ = .15 and $\tau$ = .25. Vertical-axis ticks run from 50 to 200 and horizontal-axis ticks from -2000 to 6000.]


Note: Objective functions and dual confidence regions for the returns to training example.
All models are as specified in the main text. The estimates use random assignment to
the training program as the instrument. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the
vertical axis. The horizontal line is the 95% critical value from a $\chi^2_1$ distribution. The dual confidence
region is all values of $\alpha$ such that the function value lies below the horizontal line.